ABSTRACT



We describe a method for constructing mock galaxy catalogues which are well
suited for use in conjunction with large photometric surveys. We use the semi-analytic
galaxy formation model of Bower et al. implemented in the Millennium N-body
simulation of the evolution of dark matter clustering in a ΛCDM cosmology. We apply our
method to the specific case of the surveys soon to commence with PS1, the first of 4
telescopes planned for the Pan-STARRS system. PS1 has 5 photometric bands, g, r, i, z
and y, and will carry out an all-sky "3π" survey and a medium deep survey (MDS)
over 84 sq. deg. We calculate the expected magnitude limits for extended sources in
the two surveys. We find that, after 3 years, the 3π survey will have detected over $10^8$
galaxies in all 5 bands, 10 million of which will lie at redshift z > 0.9, while the MDS
will have detected over $10^7$ galaxies with 0.5 million lying at z > 2. These numbers at
least double if detection in the shallowest band, y, is not required. We then evaluate
the accuracy of photometric redshifts estimated using an off-the-shelf photo-z code.
With the grizy bands alone it is possible to achieve an accuracy in the 3π survey of
Δz/(1 + z) ~ 0.06 in the range 0.25 < z < 0.8, which could be reduced by about
15% using near infrared photometry from the UKIDSS survey, but would increase
by about 25% for the deeper sample without the y band photometry. For the MDS
an accuracy of Δz/(1 + z) ~ 0.05 is achievable for 0.02 < z < 1.5 using grizy. A
dramatic improvement in accuracy is possible by selecting only red galaxies. In this
case, Δz/(1 + z) ~ 0.02 - 0.04 is achievable for ~100 million galaxies at 0.4 < z < 1.1
in the 3π survey and for 30 million galaxies in the MDS at 0.4 < z < 2. We
investigate the effect of using photometric redshifts in the estimate of the baryonic acoustic
oscillation scale. We find that PS1 will achieve a similar accuracy in this estimate as
a spectroscopic survey of 20 million galaxies.

Key words: cosmology: large-scale structure of the Universe - cosmology: cosmo- 
logical parameters 



1 INTRODUCTION 

Studies of the cosmic large-scale structure were brought to a new
level by the two large galaxy surveys of the past decade,
the "2-degree-field" galaxy redshift survey (2dFGRS; Colless
et al. 2001) and the Sloan Digital Sky Survey (SDSS;
York et al. 2000). The former relied on photographic plates
for its source catalogue while the latter was compiled from
the largest CCD based photometric survey to date. Most of
the large-scale structure studies carried out with these surveys
made use of spectroscopic redshifts, for about 220,000
galaxies in the case of the 2dFGRS and 585,719 galaxies
in the case of the SDSS (Strauss et al. 2002). These surveys
achieved important advances such as the confirmation of the
existence of dark energy (Efstathiou et al. 2002; Tegmark
et al. 2004) and the discovery of baryonic acoustic oscillations
(Percival et al. 2001; Cole et al. 2005; Eisenstein et al.
2005). Yet, a number of fundamental questions on the cosmic
large-scale structure remain unanswered, such as the identity
of the dark matter and the nature of the dark energy.

Further progress in the subject is likely to require a 
new generation of galaxy surveys at least one order of mag- 
nitude larger than the 2dFGRS and the SDSS. Unfortu- 
nately, measuring redshifts for millions of galaxies is infeasi- 
ble with current instrumentation. Attention has therefore 
shifted to the possibility of carrying out extremely large 
surveys of galaxies in which, instead of using spectroscopy, 
redshifts are estimated from deep multi-band photometry.
Although the accuracy of these estimates is limited, this
strategy can yield measurements for hundreds of millions 
of galaxies or more. Several instruments are currently being 
planned to carry out such a programme. The most advanced 
is the Panoramic Survey Telescope & Rapid Response Sys- 
tem (Chambers 2006). Of an eventual 4 telescopes for this 
system, the first one, PS1, is now in its final commissioning
stages and is expected to begin surveying the sky early in 
2009. This telescope is likely to be quickly followed by the 
full Pan-STARRS system and by the Large Synoptic Survey 
Telescope (LSST; Tyson 2002). Several other smaller photometric
surveys are currently underway (UKIDSS, Megacam
etc.; Lawrence et al. 2007; Boulade et al. 2003).

One of the important lessons learned from previous sur- 
veys, including 2dFGRS and SDSS, is the paramount impor- 
tance of careful modelling of the survey data for the extrac- 
tion of robust astrophysical results. Such modelling is best 
achieved using large cosmological simulations to follow the 
growth of structure in a specified cosmological background. 
The simulations can be used to create mock versions of the 
real survey in which the geometry and selection effects are 
reproduced. Such mock surveys allow a rigorous assessment 
of statistical and systematic errors, aid in the design of new 
statistical analyses and enable the survey results to be di- 
rectly related to cosmological theory. Mock catalogues based 
on cosmological simulations were first used in the 1980s, in 
connection with the CfA galaxy survey (Davis et al. 1985; 
White et al. 1988) and redshift surveys of IRAS galaxies
(Saunders et al. 1991) and have been extensively deployed 
for analyses of the 2dFGRS and the SDSS (Cole et al. 1998; 
Blaizot et al. 2006). 

The recent determination of the values of the cosmolog- 
ical parameters by a combination of microwave background 
and large-scale structure data (e.g. Sanchez et al. 2006; Ko- 
matsu et al. 2008) has removed one major layer of uncer- 
tainty in the execution of cosmological simulations. N-body 
techniques are now sufficiently sophisticated that the evolution
of the dark matter can be followed with impressive
precision from the epoch of recombination to the present 
(Springel et al. 2005). The main uncertainty lies in calculat- 
ing the evolution of the baryonic component of the Universe. 

The size of the planned photometric surveys and the 
need to understand and quantify uncertainties in estimates 
of photometric redshifts and their consequences for diagnos- 
tics of large-scale structure pose novel challenges for the con- 
struction of mock surveys. The simulations need to be large 
enough to emulate the huge volumes that will be surveyed 
and, at the same time, the modelling of the galaxy popula- 
tion needs to be sufficiently realistic to allow an assessment 
of the uncertainties introduced by photometric redshifts. At 
present, the only technique that can satisfy both of these
requirements is the combination of large N-body simulations
with semi-analytic modelling of galaxy formation. 

Semi-analytic models of galaxy formation are able to 
follow the evolution of the baryonic component in a cosmo- 
logical volume by making a number of simplifying assump- 
tions, most notably that gas cooling into halos can be cal- 
culated in a spherically symmetric approximation (White & 
Frenk 1991). Once the gas has cooled, these models employ
simple physically based rules, akin to those used in hydro- 
dynamic simulations, to model star formation and evolution 
and a variety of feedback processes. The analytical nature of
these models makes it possible to investigate galaxy formation
in large volumes and to include, in a controlled fashion,
a variety of processes, such as dust absorption and emission, 
that are currently beyond the reach of hydrodynamic simu- 
lations. (For a review of this approach, see Baugh 2006). It 
is reassuring that the simplified treatment of gas cooling in 
these models agrees remarkably well with the results of full 
hydrodynamic simulations (Helly et al. 2003; Yoshida et al. 
2002). 

An important feature of semi-analytic models is that 
they are able to reproduce the local galaxy luminosity func- 
tion from first principles (e.g. Cole et al. 2000; Benson et al. 
2003; Hatton et al. 2003; Baugh et al. 2005; Kang et al. 
2005, 2006; Bower et al. 2006; Croton et al. 2006; De Lucia
et al. 2006) and, in the most recent models, also its evolution
to high redshift (Bower et al. 2006; De Lucia et al.
2006; Lacey et al. 2008). These recent models also provide 
a good match to the distribution of galaxy colours, which 
is particularly relevant for problems relating to photomet- 
ric redshifts. And, of course, the models also calculate many 
properties which are not directly observable (e.g. rest-frame 
fluxes, stellar masses, etc) but which are important for the 
interpretation of the data. 

There are currently two main approaches to the esti- 
mation of photometric redshifts. One employs an empirical 
relation, obtained by fitting a polynomial or a more general 
function derived by an artificial neural network, between 
redshift and observed properties, such as fluxes in specified 
passbands (e.g. Connolly et al. 1995; Brunner et al. 2000; 
Sowards-Emmerd et al. 2000; Firth et al. 2003; Collister &
Lahav 2004). The second method is based on fitting the
observed spectral energy distribution (SED) with a set of
galaxy templates (e.g. Sawicki et al. 1997; Giallongo et al.
1998; Bolzonella et al. 2000; Benítez 2000; Bender et al.
2001; Csabai et al. 2003), obtained either from observations
of the local universe (e.g. Coleman et al. 1980) or from
synthetic spectra (e.g. Bruzual & Charlot 1993, 2003). Some
authors (e.g. Collister & Lahav 2004) claim that the
empirical fitting method can give smaller redshift errors, but
this method relies on having a well-matched spectroscopic
subsample that reaches the same depth in every band as
the photometric survey. Unfortunately, for Pan-STARRS or
LSST this is going to be challenging. To be conservative
in this initial investigation, we use Hyper-z because it does
not require a training set.

In this paper, we describe a method for generating 
mock catalogues suitable, in principle, for the next generation
of large photometric surveys. As an example, we construct
mock surveys tailored to PS1. PS1 will carry out
two 3-year long surveys in 5 bands (g, r, i, z, y): the "3π
survey", which will cover three quarters of the sky to a depth
of about r = 24.5, and the "medium deep survey" (MDS),
which will cover 84 sq deg to a 5σ point-source depth
of r = 27 (AB system). The former will enable a comprehensive
list of large-scale structure measurements, including
the integrated Sachs-Wolfe effect and baryonic acoustic
oscillations. The latter will be used to study clustering on
small and intermediate scales, as well as galaxy evolution.
To construct the mocks we use the semi-analytic model of
Bower et al. (2006, hereafter B06) as implemented in the
Millennium simulation (Springel et al. 2005). A sophisticated
adaptive template method based on the work of Bender
et al. (2001) will be employed for the genuine PS1 survey.
The method has been applied in the photo-z measurements
of FORS Deep Field galaxies (Gabasch et al. 2004) and
achieved $\Delta z/(1 + z_{\rm spec}) < 0.03$ with only 1% outliers. However,
the method requires precise calibration of zeropoints
in all filters using the colour-colour plots of stars, and a
control sample of spectroscopic redshifts. Consequently this
method cannot be rigorously tested until genuine PS1 data
are available. Therefore, for a first look at the photo-z
performance of PS1, we adopt the standard SED fitting method
as implemented in Hyper-z for our mock catalogues.

The paper is organised as follows. In §2, we briefly sum- 
marise the models and detail the process of constructing 
mock galaxy catalogues. In §3 we analyse some of their
properties and in §4 we use the mock catalogues to assess the
accuracy with which photometric redshifts will be estimated
by PS1. In §5, we discuss how these uncertainties are likely
to affect the accuracy with which baryonic acoustic oscillations,
one of the main targets for PS1, can be measured in the
survey. Finally, in §6, we discuss our results and present our 
conclusions. 



2 MOCK CATALOGUE CONSTRUCTION 

In this section we describe how we construct mock catalogues.
In §2.1 we describe the semi-analytic galaxy formation
code that we use. In §2.2 we compare the luminosities
and sizes of the model galaxies to SDSS data and modify 
them to improve the accuracy of the mocks. Finally in §2.3 
we describe how the mock catalogues themselves are built. 

2.1 The galaxy formation model

The first step in the process of generating a mock catalogue 
is to produce a population of model galaxies over the required
redshift range. We use the GALFORM semi-analytic
model of galaxy formation (Cole et al. 2000; Benson et al.
2003; Baugh et al. 2005; Bower et al. 2006) to do this.
GALFORM calculates the key processes involved in galaxy
formation: (i) the growth of dark matter halos by accretion
and mergers; (ii) radiative cooling of gas within halos; (iii)
star formation and associated feedback processes due to su- 
pernova explosions and stellar winds; (iv) the suppression 
of gas cooling in halos with quasistatic hot atmospheres and 
accretion driven feedback from supermassive black holes (see 
Malbon et al. (2007) for a description of the model of black 
hole growth); (v) galaxy mergers and the associated bursts 
of star formation; (vi) the chemical evolution of the hot and 
cold gas, and the stars. 

GALFORM uses physically motivated recipes to model 
these processes. Due to the complex nature of many of them, 
the model necessarily contains parameters which are set by 
requiring that it should reproduce a subset of properties of 
the observed galaxy population (see Cole et al. 2000; Baugh 
2006, for a discussion of the philosophy behind setting the 
values of the model parameters).

GALFORM predicts star formation histories for the pop- 
ulation of galaxies at any specified redshift. These histories 
are far more complicated than the simple, exponentially
decaying star formation laws sometimes assumed in the
literature (for examples of star formation histories of GALFORM
galaxies, see Baugh 2006). The GALFORM histories have the
advantage that they are produced using an astrophysical
model in which the supply of gas available for star formation
is set by source and sink processes. The sources are the infall
of new material due to gas cooling and galaxy mergers and
gas recycling from previous generations of stars. The sinks
are star formation and the reheating or removal of cooled gas
by feedback processes. The metallicity of the gas consumed
in star formation is modelled using the instantaneous
recycling approximation and by following the transfer of metals
between the hot and cold gas, and the stellar reservoirs (see
Fig. 3 of Cole et al. 2000).

The model outputs the broad band magnitudes of each 
galaxy for a set of specified filters. In this paper we use the
PS1 filter set (g, r, i, z, y) (Chambers 2006), augmented
by a few additional filters where some of the PS1 galaxies
may be observed as part of other observational programmes
(U, B, J, H, K). In addition to the magnitudes of model
galaxies in the observer's frame, we also output the rest
frame g - r colour, in order to distinguish between red and
blue galaxy populations. GALFORM tracks the bulge and disk
components of the galaxies separately (Baugh et al. 1996).
The scale sizes of the disk and bulge (assumed to follow
an exponential profile and an $r^{1/4}$-law in projection,
respectively) are also calculated (Cole et al. 2000) (see Almeida
et al. 2007, for a test of the prescription for computing the
size of the spheroid component).

The B06 model which we use in this paper employs halo
merger trees extracted from the Millennium N-body simulation
of a Λ-cold dark matter universe (Springel et al. 2005).
The model gives a very good reproduction of the shape of 
the present day galaxy luminosity function in the optical 
and near-infrared. Also of particular relevance for the pre- 
dictions presented here is the fact that this model matches 
the observed evolution of the galaxy luminosity function. 



2.2 Improving the match to SDSS observations 

To make the mocks as realistic as possible, we modify the 
luminosities and sizes of the model galaxies to give a better 
fit to SDSS data. Although both the $b_J$-band and K-band
luminosity functions of the B06 model have been shown to
agree well with observations of the local universe, the
agreement is not perfect and a shift of 0.15 magnitude faintwards
in all bands improves the match to the data as can be seen in 
Fig. 1. The original B06 K-band luminosity functions also 
match observations up to redshift z = 1.5 (Bower et al. 
2006), although the observational error bars are relatively 
large. Hence, even after applying the 0.15 magnitude shift 
the agreement between model and high redshift observations 
remains reasonably good. 

The GALFORM magnitudes we have been dealing with 
so far are total integrated magnitudes. In reality all but the 
most distant galaxies in the survey will be resolved over sev- 
eral pixels and will have lower signal to noise in each of these 
pixels than a point source would. To take this into account 
it proves convenient to use Petrosian (1976) magnitudes. 
These have the advantage over fixed aperture magnitudes 
that, for a given luminosity profile shape, they measure a 
fixed fraction of the total luminosity independently of the 
angular size and surface brightness of the galaxy.

Figure 1. Luminosity functions predicted by GALFORM, compared with the SDSS results in the r-band (left) and z-band (right), taken
from Blanton et al. (2003). The black lines with error bars indicated by the shaded region are the SDSS results. The blue lines show the
original GALFORM prediction, while the red lines show the GALFORM prediction globally shifted faintwards by 0.15 magnitudes.

The Petrosian flux within $N_P$ times the Petrosian radius, $r_P$, is:

$$F_P = 2\pi \int_0^{N_P r_P} I(r)\, r\, dr, \qquad (1)$$

where $I(r)$ is the surface brightness profile of the galaxy.
The Petrosian radius is defined such that at this radius,
the ratio of the local surface brightness in an annulus at $r_P$
to the mean surface brightness within $r_P$ is equal to some
constant value $\eta$, specifically:

$$\eta = \frac{2\pi \int_{0.8 r_P}^{1.25 r_P} I(r)\, r\, dr \,/\, [\pi((1.25 r_P)^2 - (0.8 r_P)^2)]}{2\pi \int_0^{r_P} I(r)\, r\, dr \,/\, (\pi r_P^2)}. \qquad (2)$$

We choose the parameter values as $N_P = 2$ and $\eta = 0.2$ as
adopted in the SDSS (e.g. Yasuda et al. 2001).

We decompose the surface brightness profile of each
galaxy, $I(r)$, into the superposition of a disk and a bulge:
$I(r) = I_{\rm disk}(r) + I_{\rm bulge}(r)$. The disk component is taken to
have a pure exponential profile:

$$I_{\rm disk}(r) = I_0\, e^{-1.68\, r/r_e}, \qquad (3)$$

and the bulge a pure de Vaucouleurs profile:

$$I_{\rm bulge}(r) = I_0\, e^{-7.67\,(r/r_e)^{1/4}}. \qquad (4)$$

Given these assumptions, and assuming the disks are face- 
on, we can compute the Petrosian radius by solving Eqn. (2) 
for each GALFORM galaxy. 
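
The solution of Eqn. (2) can be illustrated numerically. The following is a minimal sketch, not the code used for the mocks: it assumes a pure face-on exponential disk with $I_0 = 1$ and radii in units of the half-light radius, and the helper names (`integrate`, `petrosian_ratio`, `petrosian_radius`) are hypothetical. It uses a midpoint-rule integral and bisection on the Petrosian ratio.

```python
import math

def integrate(f, a, b, n=2000):
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def disk_profile(r, r_e=1.0):
    """Exponential disk of Eqn. (3) with I_0 = 1 (illustrative units)."""
    return math.exp(-1.68 * r / r_e)

def enclosed_flux(profile, r):
    """2*pi * integral from 0 to r of I(r') r' dr'."""
    return 2.0 * math.pi * integrate(lambda x: profile(x) * x, 0.0, r)

def petrosian_ratio(profile, r):
    """Right-hand side of Eqn. (2) evaluated at a trial radius r."""
    annulus_flux = 2.0 * math.pi * integrate(
        lambda x: profile(x) * x, 0.8 * r, 1.25 * r)
    annulus_area = math.pi * ((1.25 * r) ** 2 - (0.8 * r) ** 2)
    mean_inside = enclosed_flux(profile, r) / (math.pi * r ** 2)
    return (annulus_flux / annulus_area) / mean_inside

def petrosian_radius(profile, eta=0.2, lo=0.05, hi=20.0, iters=50):
    """Bisection for the radius at which the ratio drops to eta.

    Assumes the ratio decreases monotonically with radius, which holds
    for the exponential profile used here.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if petrosian_ratio(profile, mid) > eta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

r_p = petrosian_radius(disk_profile)
# Fraction of the total light inside N_P * r_P with N_P = 2 (Eqn. 1);
# the "total" is approximated by integrating out to 40 half-light radii.
flux_fraction = (enclosed_flux(disk_profile, 2.0 * r_p)
                 / enclosed_flux(disk_profile, 40.0))
```

For this idealised disk the Petrosian aperture encloses nearly all of the light, and the ratio of the half-light radius to $r_P$ comes out close to 0.47. A bulge or a composite profile can be passed in the same way, provided the bracketing interval is checked.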

While GALFORM does provide an estimate of the disk 
and bulge sizes, it has been shown by Almeida et al. (2007) 
that the early type galaxy sizes of the B06 model are not 
in particularly good agreement with the SDSS observational
results. Therefore, for the purposes of producing more
realistic mocks, we modify the galaxy sizes so as to match the
SDSS results given by Shen et al. (2003).

To do this we need to separate the galaxies into early 
and late types and apply separate corrections to each pop- 
ulation. First, we use the concentration parameter defined 
as $C = R_{90}/R_{50}$ to separate our galaxies into early and
late types, where $R_{90}$ and $R_{50}$ are the Petrosian 90% and
50% light radii respectively. We then calculate the ratio of
the GALFORM galaxy size to the mean found by Shen et al.
(2003) for SDSS galaxies as a function of galaxy magnitude
and obtain the average correction factor as a function of
magnitude required for the GALFORM galaxies to match the
SDSS size data. For early type galaxies at redshift z = 0.1
Shen et al. (2003) parameterised the relation between
Petrosian half-light radii in the r-band, $R_{50}$, and absolute
r-band magnitude, M, as

$$\log(R_{50}) = -0.4\,a\,M + b, \qquad (5)$$

with a = 0.60 and b = -4.63, while for late type galaxies
they found

$$\log(R_{50}) = -0.4\,\alpha M + (\beta - \alpha)\log[1 + 10^{-0.4(M - M_0)}] + \gamma, \qquad (6)$$

with $\alpha = 0.21$, $\beta = 0.53$, $\gamma = -1.31$ and $M_0 = -20.52$.
To correct the GALFORM galaxy sizes at other redshifts we
adopt $R_{50} \propto 1 - 0.27z$ for late type galaxies and
$R_{50} \propto -0.33z + 1.03$ for early type galaxies. The former is the
relation given by Bouwens & Silk (2002) which agrees with a
combination of SDSS, GEMS and FIRES survey data (Trujillo
et al. 2006). The relation for early type galaxies is
obtained by taking a linear fit to the data given by Trujillo
et al. (2006). Finally, we apply a linear relation between
$R_{50}$ and the Petrosian radius, $R_{50} = 0.47 r_P$.
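
The size relations above are simple to evaluate. The sketch below is illustrative only: the function names are hypothetical, radii are taken to be in kpc as in Shen et al. (2003), and the redshift scaling is normalised at z = 0.1 (where the relations are quoted), which is an assumption about how the two ingredients are combined.

```python
import math

def r50_early(M_r, a=0.60, b=-4.63):
    """Eqn. (5): Petrosian half-light radius of early types at z = 0.1."""
    return 10.0 ** (-0.4 * a * M_r + b)

def r50_late(M_r, alpha=0.21, beta=0.53, gamma=-1.31, M0=-20.52):
    """Eqn. (6): Petrosian half-light radius of late types at z = 0.1."""
    log_r = (-0.4 * alpha * M_r
             + (beta - alpha) * math.log10(1.0 + 10.0 ** (-0.4 * (M_r - M0)))
             + gamma)
    return 10.0 ** log_r

def redshift_scaling(z, early, z_ref=0.1):
    """Size evolution factor, normalised to unity at z_ref (assumption)."""
    if early:
        f = lambda x: -0.33 * x + 1.03   # linear fit for early types
    else:
        f = lambda x: 1.0 - 0.27 * x     # late-type relation
    return f(z) / f(z_ref)

def target_r50(M_r, z, early):
    """Target half-light radius for a galaxy of magnitude M_r at redshift z."""
    base = r50_early(M_r) if early else r50_late(M_r)
    return base * redshift_scaling(z, early)

def petrosian_radius_from_r50(r50):
    """Invert the adopted linear relation R50 = 0.47 r_P."""
    return r50 / 0.47
```

A correction factor for a model galaxy would then be the ratio of `target_r50` to the GALFORM size, averaged in bins of magnitude as described in the text.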



2.3 Building the mock catalogues 

Our goal here is to generate mock catalogues which have 
the distribution of galaxy redshifts and magnitudes expected 
for the various PS1 catalogues. For the purposes of this
paper, we do not need to retain the clustering information
contained in the Millennium Simulation. We are effectively 
generating a Monte-Carlo realisation of the redshift distri- 
bution expected for a given set of magnitude limits. The 
production of mock catalogues with clustering information 
will be described in a later paper. 

There are 37 discrete output epochs in the Millennium 
simulation between z = 0 and z = 3. The spacing of the output
times is comparable to the typical error on the estimated
value of photometric redshifts, as we will see later. To avoid 
the introduction of systematic errors caused by the discrete 
spacing of simulation output times, previous work to build 
mock catalogues used an interpolation of galaxy properties 
between output times (Blaizot et al. 2005). We follow an 
alternative approach in this paper. We have generated nine 
additional outputs which are evenly spaced between each 
pair of Millennium simulation outputs. To produce GAL- 
FORM output at each of these intermediate steps, the Mil- 
lennium simulation merger trees ending at the nearest sim- 
ulation output are used but their redshifts are re-labelled 
to match the required redshift. Then GALFORM computes 
the star formation history up to the new output redshift, 
following the baryonic physics up to that point. This re- 
sults in a much finer spacing of effective output redshifts 
which fully takes account of k-corrections, star formation 
and stellar evolution, but ignores the evolution in the dark 
matter distribution between the chosen output redshift and 
the nearest simulation redshift. 
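
As an illustration of the refined output grid, the sketch below inserts nine intermediate redshifts between each pair of simulation outputs and identifies the simulation output whose merger trees would be re-labelled. It assumes the additional outputs are evenly spaced in redshift, which the text does not specify; the function names and the example output list are hypothetical.

```python
def refine_outputs(z_outputs, n_extra=9):
    """Return a finer grid with n_extra redshifts between each output pair."""
    zs = sorted(z_outputs)
    fine = []
    for z_lo, z_hi in zip(zs[:-1], zs[1:]):
        step = (z_hi - z_lo) / (n_extra + 1)
        # z_lo itself plus the n_extra intermediate redshifts
        fine.extend(z_lo + k * step for k in range(n_extra + 1))
    fine.append(zs[-1])
    return fine

def nearest_output(z, z_outputs):
    """Simulation output whose merger trees are re-labelled to redshift z."""
    return min(z_outputs, key=lambda z_out: abs(z_out - z))

# Hypothetical, coarsely spaced simulation outputs for illustration only.
simulation_outputs = [0.0, 0.5, 1.0]
fine_grid = refine_outputs(simulation_outputs)
```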

Figure 2. Galaxy number counts in 0.5 magnitude centred bins
predicted by the GALFORM model in the r-band (blue solid line
with error bars), compared with the SDSS commissioning data
(Yasuda et al. 2001) (red crosses) and the DEEP2 survey data
(Coil et al. 2004) (green dots with error bars). The agreement
between the model and the data is excellent.

To generate a mock catalogue with a smooth redshift
distribution we proceed as follows. At each of our closely
spaced grid of redshifts, $z_i$, we have a GALFORM output
dataset consisting of a set of GALFORM galaxies sampling
a fixed comoving volume $V_{GF}$ down to a sufficiently deep
absolute magnitude. To each of these datasets we apply
a magnitude limit and record the number of galaxies, $N_i$,
that remain. The comoving number density of galaxies
expected brighter than the limit is then $n(z_i) = N_i/V_{GF}$ and
the number we expect per unit redshift in the survey is
$dN/dz = n(z)\,\Omega\, dV/dz/d\Omega$, where $\Omega$ is the solid angle of the survey and
$dV/dz/d\Omega$ is the comoving volume per unit redshift and solid
angle for the adopted cosmology. This can be used to
compute $N(z_i) = \int_{z_i - \Delta z/2}^{z_i + \Delta z/2} (dN(z)/dz)\, dz$, the number of galaxies
expected in the survey in a redshift bin $\Delta z$, centred at a
given redshift, $z_i$. To create a continuous redshift
distribution we sample at random this number of galaxies from the
corresponding GALFORM output and assign them a random
redshift in the interval $\Delta z$ such that we uniformly sample
the volume redshift relation. As we have perturbed the
redshift of each galaxy, we correspondingly perturb its apparent
magnitude according to the difference in distance modulus
between the output and assigned redshift. We will see that
the residual redshift quantisation in the evolutionary and
k-corrections is small compared with the precision achievable
for the photometric redshifts. Therefore these residual
discreteness effects are not important in the photometric
redshift error estimation.
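
The sampling procedure can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a flat ΛCDM cosmology with $\Omega_m = 0.25$ and $H_0 = 73$ km/s/Mpc (the values usually quoted for the Millennium simulation, not stated in this section), treats $n(z)$ as constant across each bin, and uses hypothetical function names.

```python
import math
import random

C_KM_S = 299792.458   # speed of light in km/s
H0 = 73.0             # km/s/Mpc (assumed)
OMEGA_M = 0.25        # matter density parameter (assumed, flat universe)

def comoving_distance(z, n=500):
    """Comoving distance in Mpc for flat LCDM (midpoint rule)."""
    if z <= 0.0:
        return 0.0
    h = z / n
    total = sum(
        1.0 / math.sqrt(OMEGA_M * (1.0 + (k + 0.5) * h) ** 3 + 1.0 - OMEGA_M)
        for k in range(n))
    return C_KM_S / H0 * h * total

def volume_per_steradian(z):
    """Comoving volume out to z per unit solid angle (Mpc^3 / sr)."""
    return comoving_distance(z) ** 3 / 3.0

def expected_count(n_density, omega_sr, z_lo, z_hi):
    """N(z_i): integral of n * Omega * dV/dz/dOmega over the bin."""
    shell = volume_per_steradian(z_hi) - volume_per_steradian(z_lo)
    return n_density * omega_sr * shell

def sample_redshift(z_lo, z_hi, rng):
    """Draw a redshift uniformly in comoving volume within [z_lo, z_hi]."""
    v = rng.uniform(volume_per_steradian(z_lo), volume_per_steradian(z_hi))
    lo, hi = z_lo, z_hi
    for _ in range(40):           # invert V(z) by bisection
        mid = 0.5 * (lo + hi)
        if volume_per_steradian(mid) < v:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def distance_modulus(z):
    """5 log10(d_L / 10 pc) with d_L = (1 + z) * comoving distance (z > 0)."""
    return 5.0 * math.log10((1.0 + z) * comoving_distance(z)) + 25.0

def shift_magnitude(m, z_output, z_assigned):
    """Correct an apparent magnitude for the change in distance modulus."""
    return m + distance_modulus(z_assigned) - distance_modulus(z_output)

rng = random.Random(1)
z_new = sample_redshift(0.50, 0.52, rng)
m_new = shift_magnitude(22.0, 0.51, z_new)
```

Here `n_density` plays the role of $N_i/V_{GF}$ and `omega_sr` is the survey solid angle in steradians.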

For the purpose of producing predictions for the redshift
distribution and number counts of galaxies in PS1 surveys,
and to provide an input catalogue with which to test
photometric redshift estimators, we generate a mock catalogue
which corresponds to a solid angle of 10 square degrees.
We generate predictions for the 3π survey and the MDS by
scaling the results from this mock to take into account the 
difference in solid angle. In a later paper, we will generate 
mock catalogues for clustering applications which will have 
the full sky coverage of these surveys. 

Finally we need to apply the magnitude limit. To do
this, we use a Gaussian random number generator to sample
the noise level $N_r$ of each galaxy for the specific survey under
consideration. The galaxy source flux $S$ is also perturbed by
its noise, $S_r = S + N_r$. We apply a 5σ cut for selection by
rejecting galaxies with signal-to-noise ratio lower than 5.
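
A minimal sketch of this selection step is given below. It assumes that the signal-to-noise ratio used for the cut is the perturbed flux $S_r$ divided by the noise standard deviation of that galaxy; the text does not spell this out, and the function name is hypothetical.

```python
import random

def apply_detection_cut(fluxes, sigmas, threshold=5.0, seed=0):
    """Perturb each flux by Gaussian noise and keep sources above threshold.

    Returns a list of (index, perturbed_flux) for the selected galaxies.
    """
    rng = random.Random(seed)
    selected = []
    for index, (flux, sigma) in enumerate(zip(fluxes, sigmas)):
        noise = rng.gauss(0.0, sigma)      # N_r
        flux_obs = flux + noise            # S_r = S + N_r
        if flux_obs / sigma >= threshold:  # 5-sigma cut (assumed definition)
            selected.append((index, flux_obs))
    return selected

# A bright source survives the cut; a source with no flux does not.
kept = apply_detection_cut([1000.0, 0.0], [1.0, 1.0])
```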



3 PS1 MOCK CATALOGUES

In this section, we apply the methodology described in §2 to 
the specific case of the PS1 survey. We begin by calculating
the magnitude limits which we expect to be reached in the
3π and MDS surveys for both point and extended sources
after one and three years of observations respectively.

Table 1. Estimated PS1 3π and Medium Deep Survey (MDS) sensitivities. The 3π survey will cover three quarters of the sky, while the
MDS will cover 84 sq deg of the sky in 10 separate regions. $m_1$ and $\mu$ are defined in section 3.1.

Filter  Bandpass  m_1    μ             exposure time    5σ pt. source    5σ pt. source    5σ pt. source     5σ pt. source
        (nm)      AB     AB            in 1st yr (3π)   in 1st yr (3π)   in 3rd yr (3π)   in 1st yr (MDS)   in 3rd yr (MDS)
                  mag    mag/arcsec^2  sec
g       405-550   24.90  21.90         60 x 4           24.04            24.66            26.72             27.32
r       552-689   25.15  20.86         38 x 4           23.50            24.11            26.36             26.96
i       691-815   25.00  20.15         60 x 4           23.39            24.00            26.32             26.91
z       815-915   24.63  19.26         30 x 4           22.37            22.98            25.69             26.28
y       967-1024  23.03  17.98         30 x 4           20.91            21.52            24.25             24.85


Table 2. Estimated UKIDSS sensitivities. All the magnitudes are in the AB system. The Large Area Survey (LAS) aims to map about
4000 sq deg of the Northern sky within a few hundred nights. The Deep Extragalactic Survey (DXS) aims to map 35 sq deg of the sky
in three separate regions.

Filter            m_1    μ            exposure time   5σ pt. source   exposure time   5σ pt. source
        (nm)      AB     AB           (LAS)           (LAS)           (DXS)           (DXS)
                  mag    mag/asec^2   sec                             h
J       1229.7    23.80  16.80        40 x 4          20.5            2.1             23.4
H       1653.3    24.58  15.48        40 x 4          20.2
K       2196.8    24.36  15.36        40 x 4          20.1            1.5             22.86




3.1 The magnitude limits for the PS1 3π and
MDS surveys

The signal registered on a CCD chip from a point source
with total apparent magnitude m, after an exposure time of
t seconds is:

$$S = 0.5\, t \times 10^{-0.4(m - m_1)}, \qquad (7)$$

where $m_1$ is the magnitude that produces 1 electron per
second. The factor of 0.5 comes from assuming the PSF is a
2D Gaussian profile and integrating over the FWHM of this
profile.

The signal-to-noise ratio for a point source is given by:

$$S/N = S \Big/ \sqrt{\sigma_s^2 + \sigma_b^2 + \sigma_{rn}^2 + \sigma_d^2}, \qquad (8)$$

where $\sigma_s^2 = 0.5\, t \times 10^{-0.4(m - m_1)}$ is the Poisson counting
noise for a source of magnitude m observed for t seconds;
$\sigma_b^2 = \frac{\pi}{4} w^2 \times 10^{-0.4(\mu - m_1)}\, t$ is the variance from the sky
background, where $\mu$ is the average sky brightness in magnitudes
per square arcsec and $w$, assumed to be 0.78 arcsecs, is the
FWHM of the PSF; $\sigma_{rn}^2 = \frac{\pi}{4} w^2 \times A^2 \times N_{\rm read}^2$ is the read-out
noise of the detector, where, for PS1, A = 3.846 pixels/arcsec
and $N_{\rm read} = 5$ is the read-out noise in electrons; $\sigma_d^2$ is the
variance due to dark current and will be assumed to be zero
(Chambers 2006). Table 1 lists the expected values of the
parameters $\mu$ and $m_1$ and also gives the 5σ point source
magnitude limits resulting from applying this formula to
the 3π survey and the MDS after one and three years.
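
Eqns. (7) and (8) can be evaluated directly. The sketch below is illustrative and makes two assumptions that are not stated in the text: the aperture is a circle of diameter $w$ (area $\pi w^2/4$), and an entry such as "38 x 4" in Table 1 is treated as a single exposure with a total time of 152 s. The function names are hypothetical.

```python
import math

W_FWHM = 0.78           # PSF FWHM in arcsec
PIX_PER_ARCSEC = 3.846  # pixel scale A
N_READ = 5.0            # read-out noise in electrons

def point_source_snr(m, m1, mu, t):
    """Signal-to-noise ratio of a point source, Eqns. (7) and (8)."""
    aperture = math.pi / 4.0 * W_FWHM ** 2             # arcsec^2 (assumed)
    signal = 0.5 * t * 10.0 ** (-0.4 * (m - m1))       # Eqn. (7)
    var_source = signal                                # Poisson noise
    var_sky = aperture * 10.0 ** (-0.4 * (mu - m1)) * t
    var_read = aperture * PIX_PER_ARCSEC ** 2 * N_READ ** 2
    # The dark-current variance is taken to be zero, as in the text.
    return signal / math.sqrt(var_source + var_sky + var_read)

def limiting_magnitude(m1, mu, t, snr=5.0, lo=10.0, hi=35.0):
    """Magnitude at which the S/N falls to the requested value (bisection)."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if point_source_snr(mid, m1, mu, t) > snr:
            lo = mid   # still brighter than the limit
        else:
            hi = mid
    return 0.5 * (lo + hi)

# r band, first year of the 3-pi survey: m_1 = 25.15, mu = 20.86, 38 x 4 s.
m_lim_r = limiting_magnitude(m1=25.15, mu=20.86, t=38 * 4)
```

Under these assumptions the r-band limit lands within about 0.1 mag of the first-year value listed in Table 1; the exact treatment of the individual exposures affects the read-out noise term.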

The signal-to-noise for resolved extended sources will be
smaller. To estimate this we take the Petrosian radius and
the redshift of a galaxy and obtain the angle subtended
by $2r_P$ of the galaxy, $\theta_g$. Then for extended sources, $\theta_g > w$,
we define the signal and the noise to be the values integrated
over the source aperture $\theta_g$ rather than the FWHM of the
PSF. Thus the signal is simply $S = t \times 10^{-0.4(m - m_1)}$, the
Poisson noise $\sigma_s^2 = t \times 10^{-0.4(m - m_1)}$, the sky background
variance is $\sigma_b^2 = \frac{\pi}{4} \theta_g^2 \times 10^{-0.4(\mu - m_1)}\, t$ and the read-out noise
is $\sigma_{rn}^2 = \frac{\pi}{4} \theta_g^2 \times A^2 \times N_{\rm read}^2$. Since we have not convolved
the image with the PSF, this treatment would produce a
sharp transition in the noise level at the PSF limit. This
can be avoided by approximating the convolved diameter of
the image by $(\theta_g^2 + \theta_P^2)^{1/2}$ and using this to replace $\theta_g$ in
the expressions for $\sigma_b^2$ and $\sigma_{rn}^2$. Table 1 gives the 5σ point
source magnitude limits resulting from applying this formula
to the 3π survey and the MDS after one and three years.
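
The extended-source case differs only in the aperture and in the factor multiplying the signal. The sketch below is a hedged illustration: it assumes that $\theta_P$ is the PSF FWHM of 0.78 arcsec and that the aperture is circular with the convolved diameter; the function name is hypothetical.

```python
import math

def extended_source_snr(m, m1, mu, t, theta_g, theta_p=0.78,
                        pix_per_arcsec=3.846, n_read=5.0):
    """S/N of an extended source with intrinsic angular diameter theta_g.

    theta_g and theta_p are in arcsec; theta_p is assumed to equal the
    PSF FWHM. The convolved diameter is sqrt(theta_g^2 + theta_p^2).
    """
    diameter_sq = theta_g ** 2 + theta_p ** 2
    aperture = math.pi / 4.0 * diameter_sq            # arcsec^2
    signal = t * 10.0 ** (-0.4 * (m - m1))            # full flux in aperture
    var_source = signal                               # Poisson noise
    var_sky = aperture * 10.0 ** (-0.4 * (mu - m1)) * t
    var_read = aperture * pix_per_arcsec ** 2 * n_read ** 2
    return signal / math.sqrt(var_source + var_sky + var_read)

# At fixed magnitude, a larger galaxy has a lower signal-to-noise ratio.
snr_compact = extended_source_snr(22.0, 25.15, 20.86, 152, theta_g=1.0)
snr_large = extended_source_snr(22.0, 25.15, 20.86, 152, theta_g=3.0)
```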

3.2 A test of the PS1 mock catalogues

Before discussing predictions from our mock catalogues for 
the 3π and MDS PS1 surveys, we first carry out a simple
test of the realism of our mock catalogues. The GALFORM
semi-analytic model has been shown to be consistent with
various basic properties of the local galaxy population such
as the luminosity functions in the $b_J$ and K bands (Cole
et al. 2001; Norberg et al. 2002; Huang et al. 2003). The
B06 version of the model also gives an excellent match to
the evolution of the rest-frame K-band luminosity function,
including the data from the K20 (Pozzetti et al. 2003) and
MUNICS surveys (Drory et al. 2003) up to redshift z = 1.5
(Bower et al. 2006). 

Since neither the bj nor the K-h&nd coincide with any of 
the PSl grizy bands, for a more direct test we compare pre- 
dicted galaxy number counts in the r-band with data. At the 
faint end we use the number counts over 5 sq deg from the 
DEEP2 survey (Coil et al. 2004), which are complete to 24.75 
in the R band. To minimise sample variance at the bright 
end, we use galaxy number counts in the SDSS commission- 
ing data (Yasuda et al. 2001) which cover about 440 sq deg 



Mock galaxy catalogues 7 





Figure 3. Expected galaxy number counts in the 3-year PS1 $3\pi$ survey (top panels) and the Medium Deep Survey (MDS) (bottom
panels), as predicted by the GALFORM model. A $5\sigma$ cut on Petrosian magnitudes has been used for selecting galaxies. Left: galaxy number
counts in 0.5 magnitude bins per sq deg in the PS1 g, r, i, z, and y bands. The black dashed lines show the g band galaxy number counts
limited only by the point source limits. Right: cumulative galaxy number counts as a function of magnitude, N(< x), where x denotes
PS1 g, r, i, z, or y bands, as indicated in the legend. The straight lines show the 3-year $5\sigma$ point source magnitude limits.



and are complete to r* = 21. (We have checked that the 
difference between the commissioning data and more recent 
SDSS releases (Fukugita et al. 2004; Yasuda et al. 2007) is 
negligible). We compute the GALFORM model predictions, 
including uncertainties, from 10 realizations of 10 sq deg 
mock surveys. The results, displayed in Fig. 2, show that 



our model prediction agrees very well with both the DEEP2
and the SDSS datasets. Note that we have applied the 0.15 
magnitude shift discussed in §2.2 to the model galaxies. 

Figure 4. Expected galaxy redshift distributions for galaxies detected in all 5 (g, r, i, z, y) PS1 bands in the $3\pi$ survey (top panels) and
the Medium Deep Survey (MDS) (bottom panels), as predicted by the GALFORM model. The left-hand panels give the differential counts,
in bins of $\Delta z = 0.02$. The right-hand panels give the cumulative counts. Blue lines show results for all galaxies while the red lines refer
exclusively to red galaxies. Solid lines are for the 3-year surveys and dotted lines for the 1-year surveys. Red galaxies are selected by a
rest-frame colour cut of $M_g - M_r > -0.04 M_r - z/15 - 0.25$, where z is the redshift (see Fig. 14). Note that these predictions have been
extrapolated from a mock catalogue which covers 10 square degrees, and so are noisier than would be expected for the actual survey
sizes.
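
As an illustration, the rest-frame colour cut quoted in the caption of Fig. 4 can be applied to a catalogue in a few lines. This is a minimal sketch only: the function name `is_red` and the array-based interface are assumptions made for the example and are not part of GALFORM or of any PS1 pipeline.

```python
import numpy as np

def is_red(M_g, M_r, z):
    """Boolean mask for the cut M_g - M_r > -0.04*M_r - z/15 - 0.25.

    M_g, M_r : rest-frame absolute magnitudes in the g and r bands
    z        : galaxy redshift
    (hypothetical helper; inputs may be scalars or arrays)
    """
    M_g = np.asarray(M_g, dtype=float)
    M_r = np.asarray(M_r, dtype=float)
    z = np.asarray(z, dtype=float)
    colour = M_g - M_r
    threshold = -0.04 * M_r - z / 15.0 - 0.25
    return colour > threshold

# Two galaxies with M_r = -21 at z = 0.5: the threshold colour is about 0.557.
mask = is_red([-20.3, -20.6], [-21.0, -21.0], [0.5, 0.5])
```

Because the threshold depends on both luminosity and redshift, the same observed colour can fall on either side of the cut at different epochs, which is why the cut is evaluated galaxy by galaxy.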



3.3 Expected PS1 galaxy number counts and
redshift distributions

We now discuss the expected population statistics for the
PS1 surveys predicted by our mock catalogues. We apply
Petrosian magnitude cuts in each of the PS1 bands and plot
the expected galaxy number counts in 0.5 magnitude bins
in Fig. 3, for both the 3-year $3\pi$ survey and the MDS. The
figure shows that, with the Petrosian magnitude cuts, the
samples are no longer complete to the $5\sigma$ point source
magnitude limits in the various bands, but rather only to limits ~2
magnitudes brighter. Note that the y-band magnitude limit
is substantially shallower than the others and so, if one
requires detection in all five bands, the y limit is the most
restrictive.

Figure 5. As Fig. 4 but for galaxies required to be detected only in the g, r, i, and z bands. Without requiring the shallow y-band
detection the number of galaxies is about twice as large as with the full g, r, i, z and y constraints.

The cumulative distributions on the right hand panels 
of Fig. 3 reveal the staggering number of galaxies that will 
be detected by PS1. For example, in the g-band after 3 years,



we expect about 10 galaxies in the 37r survey and nearly 
10** in the MDS. 

The expected redshift distributions for the two surveys
are shown in Fig. 4 for galaxies detected in all 5 bands and
in Fig. 5 for galaxies detected only in g, r, i and z, i.e. not
requiring the shallow y-band detection. In the first case, the
n(z) distribution peaks at z ~ 0.5 for the $3\pi$ survey, with
about $8 \times 10^7$ and $1.8 \times 10^8$ galaxies detected (in all 5 bands)
in the 1- and 3-year surveys respectively. The survey is so

Figure 6. Ratio of differential (left) and cumulative (right) counts for galaxies selected using a combination of r-band and one other
filter to the counts of galaxies selected using the r-band alone, as a function of r-band magnitude. For the additional filters, we use the
UKIDSS J, H and K bands, the PS1 g, i, z, and y bands and a U-band. For the PS1 grizy system, we adopt the third year magnitude
cuts and for the UKIDSS bands the LAS limits; for the U-band we assume a limit of 23 mag. The label r + x denotes that galaxies are
selected by combining the r-band and one of the other bands. The vertical blue lines indicate the r-band $5\sigma$ point source detection limit
after the 3-year surveys.



huge that, after 3 years, we expect about 10 million galaxies
at z > 0.9 and 5 million at z > 1. For the MDS survey, the
n(z) distribution peaks at z ~ 0.8, with a total of $1.7 \times 10^7$
galaxies after 3 years of which around 0.5 million lie at z > 2.
Removing the y-band constraint leads to a large increase in
the number of galaxies, as shown in Fig. 5. In this case, the
$3\pi$ survey will contain ~ $5 \times 10^8$ galaxies after three years,
with about 30 million at 1 < z < 1.3, while the MDS will
contain ~ $3 \times 10^7$ galaxies, with 4 million at z > 2.

For certain applications, for example, for the estimate
of photometric redshifts discussed in the next section, it
might be desirable to supplement the PS1 grizy filter system
with other bands, particularly in the near infrared. The
UKIRT Infrared Deep Sky Survey (UKIDSS; e.g. Lawrence et al.
2007; Hewett et al. 2006) is particularly relevant in this context.
The UKIDSS Large Area Survey (LAS) aims to map
about 4000 sq deg of the Northern sky (contained within
the $3\pi$ survey) over the course of a few hundred nights. The
Deep Extragalactic Survey (DXS) aims to cover 35 square
degrees of the sky in three separate regions which have a
large overlap with the fields chosen for the MDS. Details of
the J, H and K magnitude limits of the UKIDSS surveys
are listed in Table 2. In order to assess the compatibility of
the PS1 and UKIDSS surveys, we show in Fig. 6 the reduction
in galaxy counts, relative to a pure r-band selection,
that would result from combining in turn each of the filters
with the r-band filter. We see, once again, that the y-band
cut (green line) is much shallower than the other PS1 bands.
The UKIDSS (LAS) J, H and K bands are even shallower.
Combining r-band and U-band detections also results in a
large reduction in the counts even for an optimistic U-band
limit of 23 mag. Fig. 6 suggests that, in spite of the large
area overlap with the $3\pi$ survey, the UKIDSS (LAS) survey
may be too shallow to pick up PS1 galaxies at high redshifts.
It will be very difficult for a U-band survey to pick
up a significant number of PS1 galaxies.



4 PHOTOMETRIC REDSHIFTS IN THE PS1
SURVEY

We now examine the accuracy with which redshifts are likely
to be estimated using PS1 photometry. For this purpose we
adopt an off-the-shelf photometric redshift code, the Hyper-z
code of Bolzonella et al. (2000), which is based on fitting
template spectra. We do not attempt to tune the performance
of the estimator in any way, and so our results should
perhaps be regarded as providing a pessimistic view of the
photometric redshift performance of PS1. Once PS1 data become
available, bespoke estimators will be developed which
are optimized to return the smallest random and systematic
errors for the PS1 filter set and galaxies by having empirically
adaptive galaxy templates (e.g. Bender et al. 2001).

The basic principle behind the template fitting approach
to photometric redshift estimation is the following.
The observed SED of a galaxy is compared to a set of template
spectra and a standard $\chi^2$ minimisation is used to

Figure 7. True ("spectroscopic") redshifts plotted against photometric redshifts in a 10 sq deg mock PS1 $3\pi$ 3-year galaxy catalogue.
In each bin of photo-z the contours show the regions containing 50% (blue), 70% (red), 90% (orange) and 95% (green) of the galaxies.
Galaxies with true redshifts falling outside the 95% contours are shown by the green dots. Galaxies are selected by applying $5\sigma$ Petrosian
magnitude cuts for all 5 PS1 grizy bands. If the flux in some other filters (U, B, J, H or K) drops below its $5\sigma$ limit, the detected flux
is still used with its uncertainty. The error bars show the rms scatter after $3\sigma$ clipping. The percentages of galaxies retained after the
clipping are given in the legend. Top left: PS1 grizy band data only. Top right: PS1 grizy combined with U-band. Bottom left: PS1
grizy combined with UKIDSS (LAS) J and K. Bottom right: PS1 grizy, U-band and UKIDSS (LAS) J and K.



obtain the best fit:

$$\chi^2(z) = \sum_i \left[ \frac{F_{{\rm obs},i} - b \times F_{{\rm tem},i}(z)}{\sigma_i} \right]^2 , \qquad (9)$$

where $F_{{\rm obs},i}$, $F_{{\rm tem},i}$ and $\sigma_i$ are the observed fluxes, template
fluxes and the uncertainty in the flux through filter i, respectively,
and $b$ is a normalization factor. For the fitting procedure,
we input the PS1 grizy-band filter transmission curves
and, when appropriate, those of the UKIDSS near infrared



bands and a U band filter. We consider different reddening
laws and two sets of model templates: the mean spectra of
local galaxies given by Coleman et al. (1980, CWW) and the
synthetic spectra given by Bruzual & Charlot (1993, BC).
We set a redshift range of 0 < z < 3 for the $3\pi$ and 0 < z < 4
for the MDS sample.
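
The template fit described above can be sketched as follows. This is not the Hyper-z implementation: it is a minimal illustration that assumes the template fluxes through each filter have already been tabulated on a redshift grid, and it uses the standard least-squares result that, for fixed template and redshift, the normalization minimising the $\chi^2$ is $b = \sum_i (F_{{\rm obs},i} F_{{\rm tem},i}/\sigma_i^2) / \sum_i (F_{{\rm tem},i}^2/\sigma_i^2)$. The function names and array layout are assumptions made for the example.

```python
import numpy as np

def chi2_template(f_obs, sigma, f_tem):
    """Chi-squared of one template at one trial redshift.

    The normalization b is set analytically to its least-squares value.
    Returns (chi2, b).
    """
    w = 1.0 / sigma**2
    b = np.sum(w * f_obs * f_tem) / np.sum(w * f_tem**2)
    chi2 = np.sum(w * (f_obs - b * f_tem) ** 2)
    return chi2, b

def best_redshift(f_obs, sigma, z_grid, template_fluxes):
    """Scan a redshift grid and return the redshift minimising chi2.

    template_fluxes has shape (n_z, n_filters): the (hypothetical,
    pre-computed) template fluxes through each filter at each grid redshift.
    """
    chi2 = np.array([chi2_template(f_obs, sigma, f)[0] for f in template_fluxes])
    return z_grid[np.argmin(chi2)], chi2

# Toy data: 5 "filters", made-up template fluxes that change with redshift.
z_grid = np.linspace(0.0, 1.0, 11)
templates = np.array([[1.0, 1.0 + z, 1.0 + 2.0 * z, 1.0 + z**2, 2.0] for z in z_grid])
f_obs = 2.5 * templates[3]          # a galaxy at z_grid[3] with b = 2.5
sigma = np.full(5, 0.1)
z_best, chi2 = best_redshift(f_obs, sigma, z_grid, templates)
```

In a real application the scan would also loop over template type, age and reddening, keeping the overall minimum.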

One might be concerned that the use of Bruzual &
Charlot stellar population synthesis models to generate both
the template spectra and the galaxy spectra in the mock cat- 

Figure 8. Accuracy of the photometric redshift estimates in a 10 sq deg mock PS1 $3\pi$ 3-year galaxy catalogue. Only galaxies remaining
after applying a $3\sigma$ clipping procedure to the binned data are retained in the estimate. The retained fractions are given in the legend of
Fig. 7. We use a redshift bin size of $\Delta z = 0.05$. Left panel: $1\sigma$ uncertainty divided by (1 + z) plotted against the photo-z. Right panel:
Systematic deviation of the mean photometric redshift in each bin from the true value, as a function of photo-z.

Figure 9. As Fig. 7, but using a larger, deeper sample by not requiring a y band detection and using only griz fluxes in the determination
of photometric redshifts. Without the y-band, more galaxies are detected, but the error and bias in the photometric redshifts increase.
Left panel: results when using only the PS1 photometry. Right panel: results when adding UKIDSS (LAS) J and K-band and U-band
photometry.

Figure 10. "Spectroscopic" versus photometric redshifts, as in Fig. 7 and Fig. 9, but for galaxies that are required to be red in their
rest frame g - r colour. Top panels: deep samples in which no y-band detection is required. The left hand panel makes use of only PS1
griz photometry, while the right hand panel makes use of additional UKIDSS (LAS) J and K-band and fiducial U-band photometry. Bottom
panels: These panels show the results for the shallower sample in which detections in all 5 (grizy) PS1 bands are required. Again the
left hand panel uses only PS1 data and the right hand panel makes use of additional UKIDSS (LAS) J and K-band and fiducial U-band
photometry.



alogues might lead to an underestimate of the error on the 
photometric redshift. There are two key differences between 
the mock spectra and the templates which mean that this is 
not an issue: i) the complexity of the composite stellar pop- 
ulations of mock galaxies and ii) the differing treatments of 
dust extinction. The template spectra correspond to a single 
parameter star formation history (characterized by an expo- 
nentially decaying star formation rate, where the e-folding 
time is treated as a parameter) and a fixed metallicity for 
the stars. The mock galaxies, on the other hand, have com- 



plicated star formation histories which cannot be fitted by a 
decaying exponential (see Baugh 2006 for examples of star 
formation histories predicted by the semi-analytical mod- 
els). Furthermore, the stars in the mock galaxy have a range 
of metallicities. Hyper-z, in common with many other pho- 
tometric redshift estimators, assumes that dust forms a fore- 
ground screen in front of the stars with a particular extinc- 
tion law. In GALFORM, the dust and stars are mixed together. 
This more realistic geometry can lead to dust attenuation 

Figure 11. Accuracy of the photometric redshift estimates in a 10 sq deg mock PS1 $3\pi$ 3-year red galaxy catalogue. Results using only
griz fluxes in the determination of photometric redshifts are shown together. Without the y-band, more galaxies are detected, but the error
and bias in the photometric redshifts increase. Only galaxies remaining after applying a $3\sigma$ clipping procedure to the binned data are
plotted. We use a redshift bin size of $\Delta z = 0.05$. Left panel: $1\sigma$ uncertainty divided by (1 + z) plotted against the photo-z. Right panel:
Systematic deviation of the mean photometric redshift in each bin from the true value, as a function of photo-z.



curves which look quite different from those assumed in the 
photometric redshift code (Granato et al. 2000). 

The Hyper-z code calculates a redshift probability distribution,
P(z), for each galaxy. Because of a degeneracy
between the 4000 Å and the 912 Å breaks, P(z)
can have a double peak, causing some low redshift galaxies
to be misidentified as high redshift galaxies and vice versa.
Some of these misidentifications can be removed by applying
extra constraints, for example, the galaxy luminosity function
and the differential comoving volume as a function of
redshift (Mobasher et al. 2007). For a given observed flux,
both these functions provide an estimate of the probability
that the galaxy has redshift z, which can be used to modulate
P(z). The highest peak in the combined probability distribution
gives the best estimate of the photometric redshift.
We use the r-band luminosity function of the B06 model for
this purpose.
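
The effect of modulating P(z) with such a prior can be illustrated with a toy calculation. The sketch below assumes that the prior has already been evaluated on the same redshift grid as P(z); the decaying exponential used here is purely a stand-in and is not the luminosity function and volume prior used in the paper. The function name is hypothetical.

```python
import numpy as np

def combine_with_prior(z_grid, p_z, prior):
    """Multiply a redshift likelihood by a prior and pick the highest peak.

    Returns (z_best, combined), where combined is normalised to sum to one
    on the grid.
    """
    combined = p_z * prior
    combined = combined / combined.sum()
    return z_grid[np.argmax(combined)], combined

# Toy double-peaked P(z): a low-redshift solution and a slightly higher,
# spurious high-redshift solution caused by a break degeneracy.
z_grid = np.linspace(0.0, 3.0, 301)
p_z = (0.9 * np.exp(-0.5 * ((z_grid - 0.3) / 0.05) ** 2)
       + 1.0 * np.exp(-0.5 * ((z_grid - 2.5) / 0.05) ** 2))

# Stand-in prior (an assumption): bright galaxies are rarer at high redshift.
prior = np.exp(-z_grid)

z_no_prior = z_grid[np.argmax(p_z)]
z_best, combined = combine_with_prior(z_grid, p_z, prior)
```

Without the prior the spurious high-redshift peak is selected; with it, the low-redshift peak becomes the highest one.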

We now discuss how the accuracy and reliability of the
photometric redshift estimates depend on various choices.
We do this by calculating photometric redshifts for a 10
sq deg subsample of our mock PS1 $3\pi$ 3-year catalogue and
comparing these with the true redshifts (which we will sometimes
refer to as the "spectroscopic" redshifts).

1. Choice of SED template (CWW vs BC)

Our tests show that using 5 input spectral types (burst,
S0, Sa, Sc and Im) gives good results; adding more spectral
types does not produce further significant improvement. We
find that fitting with the CWW templates gives larger statistical
uncertainties and systematic deviations from the true
redshift than fitting with the BC templates, especially at
high redshift (z > 1). The reason for this could be that the
CWW templates are based on observations of the local universe
and may not be sufficiently representative of galaxies
at high redshift. In what follows, we will exclusively use the
BC templates.

We also experimented with BC templates for different 
metallicities. Because of the age-metallicity degeneracy in 
galaxy SEDs, we did not find any improvement by allowing 
the metallicity to vary while letting the age of the stellar 
populations be a free parameter. Since the 4000 Å break
only becomes detectable after a stellar population has aged 
beyond 10^ years, we exclude templates with ages smaller 
than this. This greatly improves the results for low redshift 
galaxies (z < 0.5).

2. Dependence on photometric bands 

The accuracy of the redshift estimates depends on the
choice of photometric bands. With our 10 sq deg 3-year mock
catalogues, we can explore which combination of bands gives
optimal results for PS1. We have considered many combinations
of the PS1 grizy photometry with UKIDSS (LAS)
JHK and fiducial B and U photometry. Note that, if the
flux through any of the U, B, J, H or K filters drops below
the $5\sigma$ flux limit, the noisy measured flux is still used with
its appropriate uncertainty.

We find that the B-band (whose effective wavelength is
very close to the g band) does not improve the fits if U is
available, but the U-band is still useful even when B-band
data are included. We also find that the H-band is not important
provided J and K are available. However, both J
and K are important for improving the quality of the fits.

Figure 12. True ("spectroscopic") redshifts plotted against photometric redshifts for the 3-year MDS survey. The data are presented in
the same fashion as in Figs 7, 8 and 12, but for the MDS we extend the redshift range to z = 3.5. Top panels: predictions for the 3-year
MDS using 1 sq deg mock catalogues. Bottom panels: predictions for samples of red galaxies. Left panels: results using only the grizy
photometry. Right panels: results when adding UKIDSS (DXS) J and K-band and U-band photometry. Galaxies are selected by applying $5\sigma$
Petrosian magnitude cuts for all 5 PS1 grizy bands. If the flux in some other filters (U, B, J, H or K) drops below its $5\sigma$ limit, the
detected flux is still used with its uncertainty. The error bars show the rms scatter after $3\sigma$ clipping. The percentages of galaxies retained
after the clipping are given in the legend.



Therefore, in what follows we will ignore B and H. Our results
are displayed in Figs. 7 and 8. In Fig. 7, we plot the
"spectroscopic" redshifts against our estimated photometric
redshifts for 4 cases. For clarity, rather than plotting
each galaxy on these plots, we have instead displayed
contours that indicate the region in each bin of photo-z that
contains 50%, 70%, 90% and 95% of the galaxies. Galaxies
with "spectroscopic" redshifts falling outside the 95% contours
are shown individually by green dots. To evaluate the
$1\sigma$ scatter we eliminate extreme outliers through standard
$3\sigma$ clipping. Typically a fraction $f_{\rm ret} > 95\%$ of the galaxies is
retained, as indicated in the legend. The 4 cases shown in Fig. 7
are the PS1 grizy photometry alone and the same photometry
supplemented by U-band photometry, UKIDSS (LAS) photometry,
or both. Fig. 8 (left panel) shows $\Delta z/(1+z)$ plotted
against redshift, where $\Delta z$ is the $1\sigma$ error from Fig. 7. The

Figure 13. Accuracy of the photometric redshift estimates for the 3-year MDS survey shown in Fig. 12. Only galaxies remaining after
applying a $3\sigma$ clipping procedure to the binned data are plotted. Galaxies are binned in the spectroscopic redshift axis with bin size
$\Delta z = 0.05$. Left panel: $1\sigma$ uncertainty divided by (1 + z) plotted against the photo-z. Right panel: Systematic deviation of the mean
photometric redshift in each bin from the true value, as a function of photometric redshift.



bias in the mean of each redshift bin relative to the true 
value is also shown (right panel). 
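
The clipped scatter, bias and retained fraction quoted in the figures can be computed for one photo-z bin along the following lines. This is a sketch under stated assumptions: the clipping is iterated to convergence about the mean, which is one common convention and may differ in detail from the procedure actually used for the figures; the function name is hypothetical.

```python
import numpy as np

def clipped_scatter(dz, nsigma=3.0, max_iter=20):
    """Iterative sigma clipping of redshift residuals dz = z_phot - z_true.

    Returns (bias, rms, f_ret): the mean and standard deviation of the
    retained residuals, and the fraction of galaxies retained.
    """
    dz = np.asarray(dz, dtype=float)
    keep = np.ones(dz.size, dtype=bool)
    for _ in range(max_iter):
        mean, std = dz[keep].mean(), dz[keep].std()
        new_keep = np.abs(dz - mean) <= nsigma * std
        if np.array_equal(new_keep, keep):
            break                      # converged: no galaxy changed status
        keep = new_keep
    return dz[keep].mean(), dz[keep].std(), keep.mean()

# Toy bin: Gaussian core of width 0.05 plus 2 per cent catastrophic outliers.
rng = np.random.default_rng(0)
dz = np.concatenate([rng.normal(0.0, 0.05, 1000), np.full(20, 1.5)])
bias, rms, f_ret = clipped_scatter(dz)
```

Dividing `rms` by (1 + z) at the bin centre gives the quantity plotted in the left-hand panel of Fig. 8, and `bias` corresponds to the right-hand panel.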

The PS1 grizy bands alone give relatively accurate photometric
redshifts in the range 0.25 < z < 0.8, with typical
rms values of $\Delta z/(1+z) \sim 0.06$. The random and systematic
errors increase at both lower and higher redshifts and there
is a population of low redshift (z < 1) galaxies which are
incorrectly assigned high redshifts. Adding the U-band produces
only a moderate improvement at all redshifts. Using
both the J and K bands results in a significant improvement
at z < 0.5, but not at higher redshifts. Finally, combining
the U, J and K bands produces the best results. For this best case,
the rms error is $\Delta z/(1+z) \simeq 0.05$ in the range 0.5 < z < 1
and, for z < 1.2, it is never larger than 0.15.

We saw in §3.3 that requiring that galaxies be detected
in y, the shallowest PS1 filter, reduces the sample size by
factors of 2-3. The deeper sample that we achieve by only requiring
griz detections has significantly less accurate photo-zs.
This is shown in Fig. 9, in which we measure photometric
redshifts using only griz photometry. In this case, the rms in the
redshift range 0.25 < z < 0.8 increases from 0.06 to 0.075
and the bias changes little.

Photometric redshift estimates for the MDS are shown
in the top 2 panels of Fig. 12 and their accuracy is quantified
by the green and blue lines in Fig. 13. If only the PS1 grizy
bands are available, an accuracy of $\Delta z/(1+z) \sim 0.05$ is achievable
for 0.02 < z < 1.5. Adding the UKIDSS (DXS) and the U-band
improves the estimates considerably, but there is still
a clear bias at very low and high redshifts. This is mainly
because the depths of the UKIDSS (DXS) and our assumed
U band are insufficient to match the depth of the MDS, so faint
galaxies are not detected in the UKIDSS J and K bands nor
in the U band.

For certain applications, for example, the measurement
of baryonic acoustic oscillations discussed in the next section,
smaller rms errors than those found above are required.
These can be achieved by selecting subsamples of galaxies
whose spectra are particularly well suited for the determination
of photometric redshifts, such as red galaxies which
have strong 4000 Å breaks. The most direct way to define
a red subsample is by using the rest-frame g - r colours. In
Fig. 14, we plot the predicted rest-frame g - r against r-band
luminosity at four different redshifts in our mock MDS catalogue.
A cut at $M_g - M_r > -0.04 M_r - z/15.0 - 0.25$ neatly
separates out the red sequence, particularly at z < 1. The
redshift distributions of red galaxies defined this way are
shown by the red lines for both the $3\pi$ survey and the MDS
in Figs. 4 and 5. The distributions peak at slightly lower
redshifts than the full samples, but there is still an impressive
number of red galaxies in the two surveys. For example,
in the $3\pi$ survey we expect about 200 million red galaxies after
3 years if detection in y is not required or 100 million if it
is.

In practice, rest-frame g - r colours are difficult to estimate
from the observations. An alternative method for identifying
red galaxies is to use the spectral type determined by
Hyper-z. We define a red sample by the following criteria:
galaxies which are best fit with the 'Burst' spectral type and
a stellar population older than 10^ yr.

Fig. 15 shows the redshift distribution for this sample,
which can be seen to be very similar to the redshift distribution
of a red sample selected by rest-frame g - r colour.
This suggests that it will be possible to select a red galaxy
sample directly from the observational data alone.

Fig. 10 and Fig. 12 show photometric redshift estimates
for red galaxies in the 3-year $3\pi$ survey and the MDS respectively.
Their accuracy is illustrated by the magenta (grizy
photometry only) and red (grizy+JK+U) lines in Fig. 11
and Fig. 13 respectively. Results without the y-band photometry
are shown in the top panels of Fig. 10 and by the green
(griz photometry only) and blue (griz+JK+U) lines in
Fig. 11. These figures show the dramatic improvement in
photometric redshift accuracy for red galaxy samples. For
example, for the $3\pi$ survey, the rms value of $\Delta z/(1+z)$ can
be as low as 0.02 at z ~ 0.8 when combining grizy with
UKIDSS (LAS) and U band measurements. Similarly, in
the MDS with the same combination of filters, the accuracy
for red galaxies is much higher than for the sample as
a whole and can be as good as $\Delta z/(1+z) \sim 0.03$ in the
redshift range 0.75 < z < 2.5.

Finally, we consider the form of the distribution of the
photo-z errors in Fig. 16. The photo-z error distributions are
well fitted by a Gaussian function, with width $\sigma_z \sim \Delta z$.
The error distribution could be fitted equally well by
a Lorentzian function. Example distributions are shown at
z ~ 0.3 and z ~ 0.5 in Fig. 16. An application of our results
for the size and form of the photo-z errors is presented in
the next section, in which we investigate their effect on the
baryonic acoustic oscillation measurements.

Figure 14. Expected colour-magnitude relation for the MDS
3-year mock catalogue. The plots show rest-frame g - r colour
versus rest-frame r-band magnitude predicted by GALFORM at
the redshifts given in each panel. The blue line is $M_g - M_r =
-0.04 M_r - z/15.0 - 0.25$, where z is the redshift. Galaxies above
the line make up the "red" sample.

Figure 15. Predicted redshift distributions for "red" galaxies
in the PS1 $3\pi$ survey, selected in two different ways. The red
lines show results for a sample selected by rest-frame g - r colour
(according to $M_g - M_r > -0.04 M_r - z/15 - 0.25$); the black lines
show results for a sample selected by the best fit photo-z spectral
type, with detail in the text. The redshift bin is $\Delta z = 0.02$. The
good agreement between the two selection methods suggests that
it may be possible to select the red sample directly from the
observed photometry.



5 IMPLICATIONS FOR BAO DETECTION 

In this section we investigate the impact of using photo- 
metric redshifts on the accuracy with which the baryonic 
acoustic oscillation (BAO) scale can be measured from the 
power spectrum of galaxy clustering. BAOs have been pro- 
posed as a standard ruler with which the properties of the 
dark energy may be measured (Blake & Glazebrook 2003; 
Linder 2003). Our aim here is to provide a simple quan- 
tification of the factor by which the effective volume of a 
survey is reduced when photometric redshifts are used in 
place of spectroscopic redshifts. This will provide a rule of 
thumb indicator of the relative performance of photometric 
and spectroscopic surveys for the measurement of BAO. We 
defer a more extensive treatment of the full impact of the 
survey window function on the measurement of BAOs to a 
later paper. Mocks with clustering will play an important 
role in assessing the optimal way to measure the clustering 
signal in photometric surveys. 

The photometric redshift technique allows large solid 
angles of sky to be covered to depths exceeding those acces- 
sible spectroscopically at a low observational cost. However, 
the inaccurate determination of a galaxy's redshift results 
in an uncertainty in its position and this leads to a distor- 
tion in the pattern of galaxy clustering. We shall refer to a 
measurement of the power spectrum which uses photomet- 
ric redshifts to assign radial positions as being in "photo-z" 
space. 

Figure 16. The distribution of photo-z errors at redshift z ~ 0.3 and z ~ 0.5 for the 3-year $3\pi$ galaxy catalogues. The histograms are
normalized to integrate to unity. Histograms in blue (z ~ 0.3) and red (z ~ 0.5) show the errors resulting from combining the grizy
bands with UKIDSS (DXS) J and K-band and U-band photometry. They can be fitted equally well by Gaussian and Lorentzian
distributions. $\sigma_z$ is the rms width of the Gaussian function; the FWHM of the Lorentzian function is also given. Dotted lines show the
best-fit Gaussians and the dashed lines illustrate the best-fit Lorentzian functions. Left: all galaxies. Right: red galaxies.



The errors introduced by photometric redshifts can be
modelled as random perturbations to the radial positions of
galaxies. We have found from our photo-z measurements
that the photo-z errors can be well fitted by a Gaussian
function. If we assume that these perturbations are Gaussian
distributed with mean equal to the true redshift and width
$\sigma_z \sim \Delta z$, then the Fourier transform of the measured density
field, $\delta_{pz}(\mathbf{k})$, can be written as

$$\delta_{pz}(\mathbf{k}) = \delta_z(\mathbf{k}) \exp(-0.5\, k_z^2 \sigma_z^2), \qquad (10)$$

where $k_z = \mathbf{k} \cdot \hat{\mathbf{z}}$, $\hat{\mathbf{z}}$ is the line-of-sight direction, $\delta_z(\mathbf{k})$
is the density field measured in redshift space and $\sigma_z$ is
expressed as a comoving length. From this
expression, the spherically averaged power spectrum can be
approximately$^1$ written as:

$$P_{pz}(k) = P_z(k)\, \frac{\sqrt{\pi}}{2}\, \frac{{\rm Erf}(k \sigma_z)}{k \sigma_z}, \qquad (11)$$

where ${\rm Erf}(x) = (2/\sqrt{\pi}) \int_0^x \exp(-t^2)\, dt$ is the error function. The
power spectrum in photo-z space can therefore be seen
as that in redshift space with additional damping on small
scales due to the large value of $\sigma_z$. On very large scales the
main contribution to the power spectrum comes from modes 
with wavelengths larger than the typical size of the pho- 
tometric redshift errors. Therefore, the clustering on these 
scales is essentially unaffected. On the contrary, on scales 

$^1$ It is an approximate expression since the redshift space distortions
and photometric redshift errors do not commute under a
spherical average (see Peacock & Dodds 1994).



comparable to and smaller than the photo-z errors, struc- 
tures are smeared out along the line-of-sight. The modes 
describing these scales along the line-of-sight contain little 
information about the true distribution of galaxies and con- 
tribute only noise to the power spectrum. 
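
The size of this damping is easy to evaluate. If each Fourier mode of the density field is multiplied by a Gaussian factor $\exp(-0.5 k_z^2 \sigma^2)$, the power is multiplied by $\exp(-k^2 \mu^2 \sigma^2)$, where $\mu$ is the cosine of the angle to the line of sight, and averaging over $0 \le \mu \le 1$ gives $(\sqrt{\pi}/2)\,{\rm Erf}(k\sigma)/(k\sigma)$. The sketch below checks this closed form against a direct numerical average. It assumes $\sigma$ is given as a comoving length in the same units as $1/k$; the function names are chosen for the example.

```python
import math
import numpy as np

def photoz_damping(k, sigma):
    """Closed-form spherical average of exp(-(k*mu*sigma)^2) over mu in [0, 1]."""
    x = k * sigma
    if x == 0.0:
        return 1.0
    return 0.5 * math.sqrt(math.pi) * math.erf(x) / x

def photoz_damping_numeric(k, sigma, n_mu=20001):
    """Same average evaluated with the trapezoidal rule, as a cross-check."""
    mu = np.linspace(0.0, 1.0, n_mu)
    f = np.exp(-(k * sigma * mu) ** 2)
    dmu = mu[1] - mu[0]
    return dmu * (f.sum() - 0.5 * (f[0] + f[-1]))

# k in h/Mpc and sigma in Mpc/h (illustrative values).
d_small = photoz_damping(0.1, 15.8)
d_large = photoz_damping(0.1, 63.4)
```

For $k\sigma \ll 1$ the factor tends to unity, so the largest scales are unaffected, while for $k\sigma \gg 1$ it falls as $1/(k\sigma)$.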

We investigate these effects directly on the measurement
of the matter power spectrum using large N-body simulations.
We use the L-BASICC ensemble of Angulo et al.
(2008), which consists of 50 low-resolution, large volume
simulations. Each has a volume of $2.4\,(h^{-1}\,{\rm Gpc})^3$ and resolves
halos more massive than $1 \times 10^{13}\,h^{-1} M_\odot$. The assumed cosmological
parameters are $\Omega_m = 0.25$, $\Omega_\Lambda = 0.75$, $h = 0.73$,
$n = 1$ and $\sigma_8 = 0.9$. Their huge volume makes the L-BASICC
simulations ideal to study the detectability of BAO in future
surveys. Photometric redshift errors are mimicked as a
random perturbation added to the particles' position along
one direction (line-of-sight). The perturbations are drawn
from a Gaussian distribution with various widths representing
different degrees of uncertainty in the photometric redshift.
Despite their large volume, the L-BASICC boxes are
more than an order of magnitude smaller than the volume
which will be covered by the $3\pi$ survey. Hence, we present
results for the relative change expected in the random errors
for different photometric redshift errors. Angulo et al.
found that any systematic error in the recovery of the BAO
scale was comparable to the sampling variance between L-BASICC
realizations. To address the question of systematic
errors we will need to use even larger volume simulations. 
Furthermore, new estimators are likely to be developed to 

Figure 17. The mean and standard deviation of the dark matter power spectrum averaged over an ensemble of 50 N-body simulations
at z = 0.5. The top panels display the power spectrum in three different cases: (i) redshift space (solid red line), (ii) photo-z space (blue
line) in which the position of each dark matter particle has been perturbed to mimic the effect of photometric redshift errors, and (iii) the
photo-z space power spectrum derived from Eq. (11) and the measured redshift space power spectrum (red dashed lines). The horizontal
dashed line illustrates the shot-noise level. In the bottom panels we plot the photo-z power spectrum divided by a smooth reference
spectrum. This reveals the impact of photometric redshift errors directly on the baryonic acoustic oscillations (BAO). An increase in these
errors causes an increase in the noise and a decrease in the amplitude of the BAO at high wavenumber. This implies that photometric
redshifts affect scales much larger than the photometric redshift errors due to an effective reduction of the number of Fourier modes and
the smearing of the underlying true clustering.




extract the optimal BAO signal from photometric surveys. 
These more detailed questions are deferred to a later paper. 
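
The way photometric redshift errors are mimicked in the simulations can be sketched in a few lines. The example below perturbs particle positions along one axis with Gaussian offsets; the periodic wrapping, the box size of 1340 (in the same length units as sigma) and the function name are assumptions made for the illustration and are not taken from the text.

```python
import numpy as np

def perturb_line_of_sight(pos, sigma, box_size, axis=2, seed=None):
    """Add Gaussian offsets of width sigma along one axis of a periodic box.

    pos      : array of shape (n_particles, 3), comoving positions
    sigma    : photo-z error expressed as a comoving length
    box_size : side of the periodic box (assumption: positions wrap around)
    """
    rng = np.random.default_rng(seed)
    out = np.array(pos, dtype=float, copy=True)
    out[:, axis] += rng.normal(0.0, sigma, size=out.shape[0])
    out[:, axis] %= box_size          # keep particles inside the box
    return out

# Illustrative use with random "particles".
L = 1340.0
rng = np.random.default_rng(42)
pos = rng.uniform(0.0, L, size=(10000, 3))
new = perturb_line_of_sight(pos, 15.8, L, seed=1)

# Recover the applied displacement, undoing the periodic wrap.
d = (new[:, 2] - pos[:, 2] + 0.5 * L) % L - 0.5 * L
```

Only the chosen axis is modified, so the transverse clustering is untouched and the smearing acts purely along the line of sight.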

In the upper panels of Fig. 17 we show the mean, spheri- 
cally averaged power spectrum of the dark matter measured 
from the l-basicc simulations at z = 0.5, along with its 
variance, in photo-z space (solid blue lines). The size of the 
photo-z errors are = 0.01 and = 0.04 (equivalent to 
15.8 and 63.4 ft"^Mpc at z = 0.5) in the left- and right- 
hand panels respectively. We have also plotted the power 
spectrum measured in redshift-space (solid red lines) and 
the analytical expression of Eq. 11 (dashed red line). By 
comparing the spectra in redshift and photo-z spaces, the 
additional damping described above is evident. Also, we see 
that Eq. 11 describes quantitatively this extra damping on 
scales where the power spectrum is not shot-noise domi- 
nated. 
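
The extra damping can be illustrated with a minimal sketch. It assumes, as a simplification that may differ in detail from Eq. 11, that photo-z errors displace objects along the line of sight by Gaussian offsets of rms `sigma` (in h^-1 Mpc), so that the power in a mode with line-of-sight cosine mu is suppressed by exp(-(k mu sigma)^2). The function name `spherical_damping` is hypothetical.

```python
import math
import numpy as np

def spherical_damping(k, sigma):
    """Average of exp(-(k*mu*sigma)**2) over mu in [0, 1].

    Assumes Gaussian line-of-sight errors of rms sigma (an assumption of
    this sketch). The closed form is sqrt(pi)/(2x) * erf(x), with x = k*sigma.
    """
    x = np.atleast_1d(np.asarray(k, dtype=float)) * sigma
    out = np.ones_like(x)            # the limit x -> 0 is exactly 1
    nonzero = x > 1e-8
    erf = np.vectorize(math.erf, otypes=[float])
    out[nonzero] = math.sqrt(math.pi) / (2.0 * x[nonzero]) * erf(x[nonzero])
    return out

# Damping of the spherically averaged power for the smaller error quoted above.
k = np.array([0.0, 0.05, 0.2])       # wavenumbers in h/Mpc
damping = spherical_damping(k, 15.8)
```

Multiplying a redshift-space spectrum by this factor gives the kind of scale-dependent suppression seen in the upper panels, under the stated Gaussian assumption.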

In the lower panels of Fig. 17 we take a closer look at 
the BAO by isolating them from the large-scale shape of the 
power spectrum. We do this by dividing the power spectrum 
by a smoothed version of the measurement. It is clear that 
since the number of "noisy modes" increases with the size of 
the photometric redshift errors, the error on the power spec- 
trum and therefore on the BAO also increases. The visibility 
of the higher harmonic BAO is also reduced as the photo- 
metric redshift error increases. In order to quantify the loss 
of information, we have followed a standard technique to 
measure BAO as described in Angulo et al. (2008) (see also 
Percival et al. 2007 and Sanchez et al. 2008). The method 
basically consists of dividing the measured power spectrum 



by a smoothed version of the measurement. In this way, any 
long wavelength gradient or distortion in the shape of the 
power spectrum is removed which diminishes the impact of 
possible systematic errors due to redshift space distortions, 
galaxy bias, nonlinear evolution and, in the case described 
in this paper, photometric redshift distortions. Then, we 
construct a model ratio using linear perturbation theory, 
P_L/P_smooth, which we fit to the measured ratios. In the fitting 
procedure there are two free parameters: (i) a damping 
factor to account for the destruction of BAO peaks located 
at high k by nonlinear effects and redshift-space distortions 
and (ii) a stretch factor, α, which quantifies how accurately 
we can measure the BAO wavelength. The latter gives a simple 
estimate of how well we can constrain the dark energy 
equation of state from BAO measurements alone. 
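
The two-parameter fit described above can be sketched as follows. This is a toy illustration, not the pipeline of Angulo et al. (2008): the sinusoidal `toy_bao_ratio`, the scale `R_S`, the Gaussian damping form and the brute-force grid search are all assumptions made here for clarity.

```python
import numpy as np

R_S = 105.0  # hypothetical oscillation scale in Mpc/h, for illustration only

def toy_bao_ratio(k, amp=0.05):
    """Stand-in for the linear-theory ratio: unity plus a sinusoidal wiggle."""
    return 1.0 + amp * np.sin(k * R_S)

def model_ratio(k, alpha, k_nl):
    """Stretch the wiggles by alpha and damp them with a Gaussian in k."""
    damping = np.exp(-k**2 / (2.0 * k_nl**2))
    return 1.0 + (toy_bao_ratio(alpha * k) - 1.0) * damping

def fit_stretch(k, ratio, sigma, alphas, k_nls):
    """Brute-force chi-square minimisation over the two free parameters."""
    best = (np.inf, None, None)
    for alpha in alphas:
        for k_nl in k_nls:
            chi2 = np.sum(((ratio - model_ratio(k, alpha, k_nl)) / sigma) ** 2)
            if chi2 < best[0]:
                best = (chi2, alpha, k_nl)
    return best[1], best[2]

# Noise-free synthetic "measurement" with a known stretch and damping scale.
k = np.linspace(0.02, 0.3, 60)
data = model_ratio(k, alpha=1.02, k_nl=0.15)
alpha_fit, k_nl_fit = fit_stretch(
    k, data, sigma=0.01,
    alphas=np.linspace(0.9, 1.1, 201),
    k_nls=np.linspace(0.05, 0.30, 26),
)
```

In a real analysis the scatter of the best-fitting stretch factor across an ensemble of simulations would give the uncertainty on the BAO scale; here the noise-free input simply checks that the input stretch is recovered.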

Fig. 18 shows the results of applying our fitting procedure 
to the L-BASICC ensemble at different redshifts. On 
the x-axis we plot the size of the photometric redshift error 
divided by (1 + z), whilst on the y-axis we plot the 
predicted error on α divided by the error we infer for an 
ideal spectroscopic survey (i.e. from the power spectrum in 
redshift-space). Since the error on α scales with the error 
on the power spectrum and the latter is inversely proportional 
to the square root of the volume of the survey, the y-axis value 
should be roughly equal to the square root of the factor by which 
the volume of a photometric redshift survey needs to be larger 
than the volume of a spectroscopic survey to achieve the same 
accuracy. 
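
The scaling argument above amounts to a one-line conversion, sketched here under the stated assumption that the error on the BAO scale goes as the inverse square root of the survey volume. The function names are hypothetical.

```python
import math

def volume_factor(error_ratio):
    """Volume ratio (photometric / spectroscopic) needed for equal accuracy,
    assuming the error scales as volume**-0.5."""
    return error_ratio ** 2

def error_ratio_from_volume(factor):
    """Inverse relation: the error ratio implied by a given volume factor."""
    return math.sqrt(factor)

# A photo-z survey whose BAO error is twice the spectroscopic one
# needs four times the volume to reach the same accuracy.
needed = volume_factor(2.0)
```

Read the other way, a volume reduction factor of about 5, the value quoted in this section for our analysis, corresponds to an error ratio of about 2.2.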

Several authors have investigated the implications of 
photometric redshift errors on the clustering measurements 
in general and on the BAO in particular (Seo & Eisenstein 
2003; Amendola et al. 2005; Dolney et al. 2006; Blake & Bridle 
2005). Our analysis improves upon these studies in several 
ways: (i) we have included photometric redshift errors 
directly into a realistic distribution of objects; (ii) by using 
N-body simulations, our calculation takes into account the 
effects introduced by nonlinear evolution, nonlinear redshift-space 
distortions and shot noise; (iii) the use of 50 different 
simulations enables a robust and realistic estimation of the 
errors on the power spectrum measurements; (iv) we have 
investigated how our results change if we use the actual 
distribution of photometric redshift errors (the cyan and brown 
circles in Fig. 18) instead of a Gaussian fit, and we find only 
a small additional degradation. 

These improvements lead to predictions that are somewhat 
different from previous ones. For example, for Δz = 
0.03, Blake & Bridle (2005) predict a factor of ~10 for the 
reduction of the effective volume of a photometric survey. 
Here, as shown in Fig. 18, we find a reduction which is a 
factor of 2 smaller than this (i.e. a volume reduction 
factor of ~5). The main difference between our analyses 
is that Blake & Bridle (2005) use only modes with wavenumbers 
below k_max = 2/σ_z, arguing that wavelengths shorter than the 
size of the photometric redshift errors contribute only noise. 
In reality, there is a smooth transition around k_max, with 
signal coming from all wavenumbers (with different weighting, 
of course). In addition, the neglect of nonlinear evolution 
(which erases the BAO at high wavenumbers) contributes 
to Blake & Bridle (2005) overestimating the reduction 
in effective volume. These two effects together explain 
the disagreement between our results. 



6 DISCUSSION AND CONCLUSIONS 

We have described a method for constructing mock galaxy 
catalogues which are well suited to aid in the preparation, 
and eventually in the interpretation of large photometric 
surveys. We applied our mock catalogues specifically to the 
data that will shortly begin to be collected with PS1, the 
first of the 4 telescopes planned for the Pan-STARRS system. 

Our mock catalogue building method relies on the use of 
two complementary theoretical tools: cosmological N-body 
simulations and a semi-analytic model of galaxy formation. 
For this study, we have employed the Millennium N-body 
simulations of Springel et al. (2005) together with galaxy 
properties calculated using the GALFORM model with the 
physics described by Bower et al. (2006). Although this 
model gives quite a good match to the local galaxy lumi- 
nosity function in the B- and K-band, we refined the match 
by applying a small correction of 0.15 mag to the luminosi- 
ties of all galaxies, so that the agreement with the SDSS 
luminosity function is excellent. Similarly, we applied a cor- 
rection to the predicted galaxy sizes as a function of redshift 
in order to match the SDSS distribution of Petrosian half-light 
radii in the r-band. As a simple test, we showed that 
our galaxy formation model agrees very well with the r-band 
number counts in the SDSS Commissioning and DEEP2 data 
over a range of 12 magnitudes. 

We adopt a magnitude system similar to that of the SDSS, 
Figure 18. The ratio of the error on the measurement of the 
BAO scale in photo-z space to that in redshift space (i.e. from 
a perfect spectroscopic redshift) as a function of the magnitude 
of the photometric redshift error. Assuming that the error on the 
measurement scales with the square root of the volume, then the 
y-axis gives the square root of the ratio of volumes of photometric 
to spectroscopic surveys which achieve the same accuracy in the 
measurement of the BAO scale. Note that this quantity is 
independent of the redshift at which the measurement is made, i.e. it 
is independent of the degree of nonlinearity present in the dark 
matter distribution. The cyan and brown circles give the results 
from using the actual distribution of photometric redshift errors, 
while the others assume a Gaussian error distribution shown in 
Fig. 16. 

based on the use of Petrosian magnitudes and use these to 
calculate the expected magnitude limits for extended objects 
in the two surveys that PS1 will undertake, the "3π" survey 
and the MDS. We find that, after 3 years, the 3π survey will 
have detected ~2 x 10^8 galaxies in all 5 photometric bands 
(g, r, i, z and y), with a peak in the redshift distribution of 
~0.5 and an extended tail containing about 10 million galaxies 
with z > 0.9. The MDS will detect ~2 x 10^7 galaxies, 
the redshift distribution peaking at z ~ 0.8, with 0.5 million 
galaxies lying at z > 2. Of the 5 PS1 bands y is the 
shallowest and removing the requirement that a galaxy be 
detected in this band more than doubles the total numbers 
in the sample. 

We have used our mock catalogues to take a first look at 
the accuracy of photometric redshifts in the PS1 photometric 
system. Photometric redshifts can be readily estimated 
using the public Hyper-z code of Bolzonella et al. (2000). 
With the PS1 grizy bands alone it is possible, in principle, 
to achieve an accuracy in the 3π survey of Δz/(1 + z) ~ 0.06 
in the range 0.25 < z < 0.8. This could be reduced to ~0.05 
by adding J and K photometry from the UKIDSS (LAS) 
and could be improved even further with a hypothetical U 
band survey to 23 mag, although the samples become 
progressively smaller as these additional bands are added. 
Cutting at the relatively shallow depth of the y-band is 
important in achieving these errors. Going deeper than the y-band 
data would increase the sample size substantially, but the 
errors would increase to ~0.075. There is therefore a balance 
to be struck between sample size and accuracy: reducing the 
sample size (by about a factor of 2) increases the accuracy of 
all the photometry and allows the y-band to be used, which 
together increase the photometric redshift accuracy. For 
the MDS an accuracy of Δz/(1 + z) ~ 0.05 is achievable for 
0.02 < z < 1.5 using the PS1 bands alone, with similar 
fractional improvements as for the 3π survey by the inclusion of 
U and near infrared bands. 
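
For readers who wish to reproduce this kind of accuracy figure from a mock catalogue, a minimal sketch follows. It assumes that the quoted accuracy is the rms of (z_phot - z_true)/(1 + z_true) for objects whose true redshift lies in the stated range; both the rms choice and the selection on true redshift are assumptions of this sketch, and the function name `photoz_scatter` is hypothetical.

```python
import numpy as np

def photoz_scatter(z_phot, z_true, z_min=None, z_max=None):
    """RMS of (z_phot - z_true) / (1 + z_true) within an optional redshift range.

    The range is applied to the true redshift (an assumption of this sketch).
    """
    z_phot = np.asarray(z_phot, dtype=float)
    z_true = np.asarray(z_true, dtype=float)
    keep = np.ones(z_true.shape, dtype=bool)
    if z_min is not None:
        keep &= z_true >= z_min
    if z_max is not None:
        keep &= z_true <= z_max
    resid = (z_phot[keep] - z_true[keep]) / (1.0 + z_true[keep])
    return float(np.sqrt(np.mean(resid ** 2)))

# Two mock galaxies at z = 0.5 with estimates scattered by +/- 0.05.
scatter = photoz_scatter([0.55, 0.45], [0.5, 0.5], z_min=0.25, z_max=0.8)
```

Other summary statistics, such as a clipped rms or a percentile width, would give different numbers for the same catalogue, so the definition should be stated alongside any quoted accuracy.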

A dramatic improvement in photometric redshift accuracy 
can be achieved for samples containing only red galaxies. 
We have shown that it should be possible to identify 
a red sample (i.e. with red rest-frame g - r colours) 
directly from the photometric data using the best-fit Hyper-z 
templates. These samples can still contain large numbers 
of galaxies. For example, an accuracy of Δz/(1 + z) ~ 
0.02-0.04 may be achievable for ~100 million red galaxies at 
0.4 < z < 1.1 in the 3π survey. Similarly, for the MDS, this 
sort of accuracy could be achieved for ~30 million galaxies 
at 0.4 < z < 2. These estimates are all based on the 
"off-the-shelf" Hyper-z code, without any tuning of the code for the 
PS1 setup. We expect that further improvements should be 
possible by refining the photometric redshift estimator and 
tailoring it specifically to the PS1 bands. 

Our analysis is based on the use of the GALFORM 
semi-analytic galaxy formation model. Although this model gives 
a good match to a large range of observed galaxy properties, 
it is based on a number of approximations and has uncertain 
elements which could be relevant to the estimation of 
photometric redshifts. These include the effects of reddening, 
assumptions about the frequency and duration of bursts 
and the use of the Bruzual & Charlot (1993) stellar population 
synthesis libraries, which are the same as assumed 
in our implementation of Hyper-z. We note that the star 
formation histories predicted by the model are much more 
varied and have a richer structure than those assumed to 
construct the Hyper-z templates, and that the treatment of 
dust extinction is very different in GALFORM. Abdalla et al. 
(2008) carried out a similar study to ours and reached sim- 
ilar conclusions about the size of the photometric redshift 
errors and the usefulness of additional filters in the NIR or 
far-UV. This is encouraging as Abdalla et al. used a com- 
pletely different photometric redshift estimator, ANNz, an 
artificial neural network code written by Collister & Lahav 
(2004). Furthermore, instead of using a galaxy formation 
model to generate a mock catalogue, these authors used a 
mixture of empirical and theoretical techniques to produce 
a set of galaxies on which to test their estimator. 

One of the main applications of the PS1 3π survey will 
be the determination of the scale of baryonic acoustic 
oscillations, used to constrain the properties of the dark energy. 
We have investigated how uncertainties in the photometric 
redshifts will degrade the determination of the BAO scale 
and, in particular, we have quantified the factor by which 
the effective volume of a photometric survey is reduced by 
these uncertainties. We find that, with the sorts of 
photometric redshift uncertainties that we have estimated for a 
red sample, PS1 will achieve the same accuracy as a 
spectroscopic galaxy survey containing 1/5 as many galaxies. 
Unfortunately, spectroscopy for 20 million galaxies at z ~ 1 
is not likely to be feasible for some time. PS1 should be able 
to provide competitive estimates of the BAO scale in the 
next few years. 